Re: plperl doesn't release memory - Mailing list pgsql-general

From Dan Sugalski
Subject Re: plperl doesn't release memory
Date
Msg-id a06210200be71b8149204@[172.24.18.155]
In response to Re: plperl doesn't release memory  ("GIROIRE Nicolas (COFRAMI)" <nicolas.giroire@airbus.com>)
List pgsql-general
At 8:38 AM +0200 3/31/05, GIROIRE Nicolas (COFRAMI) wrote:
Hi,
I work with William.
In fact, we have already written the procedure in pl/pgsql, but it is too slow, and we use arrays, which are native in perl.
The procedure is recursive and runs queries against PostgreSQL.
Judging by how memory use grows, it seems that no memory is ever freed. I think that comes from the fact that the procedure is recursive.
The procedure runs for 3 hours and ends with an out-of-memory error.
Can we force pl/perl to free the memory used by variables?
Or can we configure PostgreSQL to cope with this increase in load?
Or is there another idea?

Perl generally frees things up as soon as they're no longer used, but there are a few cases where you'll run into trouble.

The first is the circular reference problem -- because perl uses reference counting, circular data structures won't ever die on their own, something you'll need to watch out for.
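
As a minimal sketch of the circular reference case: the `Node` class and the `make_cycle` helper below are hypothetical, and exist only to count how many objects get freed. `Scalar::Util::weaken` is one common way to break such a cycle; whether it suits your data structures is something you'd have to judge.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical class, only here to count how many objects get freed.
package Node;
our $destroyed = 0;
sub new     { my ($class) = @_; return bless {}, $class }
sub DESTROY { $destroyed++ }

package main;

# Build two nodes that point at each other, then let them go out of scope.
sub make_cycle {
    my ($weak) = @_;
    my $parent = Node->new;
    my $child  = Node->new;
    $parent->{child} = $child;
    $child->{parent} = $parent;
    # A weakened back-reference no longer counts toward the refcount.
    weaken($child->{parent}) if $weak;
    return;
}

make_cycle(0);
my $leaked_run = $Node::destroyed;                 # cycle keeps both alive

make_cycle(1);
my $weak_run = $Node::destroyed - $leaked_run;     # both nodes freed

print "freed without weaken: $leaked_run, with weaken: $weak_run\n";
# prints "freed without weaken: 0, with weaken: 2"
```

The leaked pair is only reclaimed when the interpreter shuts down, which in a long-lived backend is far too late.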

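The second is scope: you need to make sure the variables actually go out of scope. With a recursive procedure this is a definite worry, since perl cleans up when variables go out of scope, and for variables declared at the top level of a sub that doesn't happen until that invocation actually exits. Every level of recursion that is still pending holds on to its own copies.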
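
Here is a small sketch of the scope issue in a recursive sub. The `Buffer` class and both `walk_*` subs are made up for illustration; `Buffer` just tracks how many instances are alive at once, standing in for whatever large data each level of your recursion builds.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a large per-call data structure.
package Buffer;
our ($live, $peak) = (0, 0);
sub new {
    my ($class) = @_;
    $live++;
    $peak = $live if $live > $peak;
    return bless [], $class;
}
sub DESTROY { $live-- }

package main;

# $buf stays alive until this invocation returns, so every pending
# level of the recursion holds one buffer.
sub walk_holding {
    my ($depth) = @_;
    return if $depth == 0;
    my $buf = Buffer->new;
    walk_holding($depth - 1);
    return;
}

# Same recursion, but the buffer is dropped before descending.
sub walk_releasing {
    my ($depth) = @_;
    return if $depth == 0;
    my $buf = Buffer->new;
    undef $buf;
    walk_releasing($depth - 1);
    return;
}

walk_holding(50);
my $holding_peak = $Buffer::peak;

$Buffer::peak = 0;
walk_releasing(50);
my $releasing_peak = $Buffer::peak;

print "peak live buffers: holding=$holding_peak, releasing=$releasing_peak\n";
# prints "peak live buffers: holding=50, releasing=1"
```

Wrapping the temporary in an inner block that ends before the recursive call would have the same effect as the explicit undef.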

Perl also does some optimistic caching as a performance booster, which is generally a win but sometimes isn't. Although perl cleans up the contents of variables, it leaves the structure of a sub's arrays and hashes in place for reuse. (This happens only once for each sub, so it doesn't get nuts for recursive invocations of a subroutine.) That's not normally a problem, but if you've got a 100M element array the bits add up.

Finally, make sure you're using a relatively recent perl, one of the 5.8 versions. There were some bugs relating to closures that got patched up -- earlier versions had some reference count issues there so closures and their contents tended not to ever get cleaned up.

*Assuming* you're not actually leaking data with circular structures and the like, or throwing massive amounts of data into globals, there are a few things you can do to keep your memory usage in line.

1) Do *not* pass in large arrays or hashes as parameters. Use references to them instead, to avoid perl's parameter flattening.
2) Kill your data yourself when you're done with it by undef()ing the variables. (Do *not* just assign empty lists or empty strings. That isn't enough.) "undef @foo", for example, will completely clean out the @foo array, and leave you with a variable that only takes up 56 bytes or so.
3) Try to keep the number of hash keys you use relatively low. (Not normally an issue, but once you start getting into millions of entries it adds up.) Perl makes individual small allocations for hash keys, and that tends to fragment the free list.
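
Points 1 and 2 can be sketched roughly as follows. The sub names and the `@rows` array are invented for the example. Note that in the flattened call the actual copy is made when the sub unpacks `@_` into its own lexical array, which is the usual idiom.

```perl
use strict;
use warnings;

# Flattened call: unpacking @_ into a lexical array copies every element.
sub sum_by_copy {
    my @values = @_;
    my $sum = 0;
    $sum += $_ for @values;
    return $sum;
}

# Reference call: only a single scalar (the reference) is passed and stored.
sub sum_by_ref {
    my ($values) = @_;
    my $sum = 0;
    $sum += $_ for @$values;
    return $sum;
}

my @rows = (1 .. 1000);
my $copied = sum_by_copy(@rows);
my $shared = sum_by_ref(\@rows);

# Done with the data: undef releases the array's storage as well as
# its elements, which assigning an empty list does not do.
undef @rows;
my $left = scalar @rows;

print "copy=$copied ref=$shared elements left=$left\n";
# prints "copy=500500 ref=500500 elements left=0"
```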

It might be worth a code review to see if you're doing things that are inefficient in general. That tends to be an issue when working with large data sets, since inefficiencies that don't matter with 100 (or 100K) records become an issue when you get into massive data sets.

You can also do some memory usage investigation with Devel::Size and some of the other Devel modules. (Though be warned that Devel::Size is pretty profligate itself with memory)
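
A rough sketch of that kind of check, assuming Devel::Size has been installed from CPAN (it is not a core module, so the code falls back to a message if it is missing). `@foo` is just sample data; the byte counts depend on your perl build, so none are shown here.

```perl
use strict;
use warnings;

my @foo = (1 .. 100_000);
my $report;

# Load Devel::Size at run time so the script still runs without it.
if (eval { require Devel::Size; 1 }) {
    # total_size follows the reference and counts the elements too.
    my $full = Devel::Size::total_size(\@foo);
    undef @foo;
    my $empty = Devel::Size::total_size(\@foo);
    $report = "before undef: $full bytes, after undef: $empty bytes";
}
else {
    $report = "Devel::Size is not installed";
}

print "$report\n";
```

Measure on a small sample of your data rather than the full set, given how much memory the module itself can use.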

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Sean Davis
Sent: Wednesday, March 30, 2005 17:01
To: FERREIRA William (COFRAMI)
Cc: Postgresql-General list
Subject: Re: [GENERAL] plperl doesn't release memory

As I understand it, a single execution of a pl/perl function will not
be affected by the perl memory issue, so I don't think that is your
problem.
My guess is that you are reading a large query into perl, so the whole
thing will be kept in memory (and you can't use more memory than you
have).  For a large query, this can be a huge amount of memory indeed. 
You could use another language like plpgsql that can support
cursors/looping over query results or, in plperl you could use DBI (not
spi_exec_query) and loop over query results.
Hope this helps,
Sean
On Mar 30, 2005, at 9:33 AM, FERREIRA William (COFRAMI) wrote:
> I have a similar problem.
> I'm running PostgreSQL on a PIV with 1 GB of RAM and Windows 2000 NT.
> I have a large database and a big job that takes more than 4 hours.
> During the first hour PostgreSQL uses as much memory as virtual memory,
> and I find this strange (growing to more than 800 MB).
>
> During the execution I get:
> out of memory
> Failed on request of size 56
> and at the end, PostgreSQL uses 300 MB of memory and more than 2 GB of
> virtual memory.
>
> Can this problem be resolved by tuning PostgreSQL settings?
> here are my parameters :
> shared_buffers = 1000
> work_mem = 131072
> maintenance_work_mem = 131072
> max_stack_depth = 4096
> I tried work_mem with 512MB and 2MB and I get the same error...
>
> I read all the posts, but I don't know how I can configure perl on
> Windows...
>
> thanks in advance
>        
>          Will
>
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Dan Sugalski
> Sent: Friday, March 25, 2005 19:34
> To: Greg Stark; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] plperl doesn't release memory
>
>
>
> At 6:58 PM -0500 3/24/05, Greg Stark wrote:
> >Dan Sugalski <dan@sidhe.org> writes:
> >
> >>  Anyway, if perl's using its own memory allocator you'll want to rebuild it
> >>  to not do that.
> >
> >You would need to do that if you wanted to use a debugging malloc. But there's
> >no particular reason to think that you should need to do this just to work
> >properly.
> >
> >Two mallocs can work fine alongside each other. They each call mmap or sbrk to
> >allocate new pages and they each manage the pages they've received. They won't
> >have any idea why the allocator seems to be skipping pages, but they should be
> >careful not to touch those pages.
>
> Perl will only use a single allocator, so there's not a huge issue
> there. It's either the external allocator or the internal one, which
> is for the best since you certainly don't want to be handing back
> memory to the wrong allocator. That way lies madness and unpleasant
> core files.
>
> The bigger issue is that perl's memory allocation system, the one you
> get if you build perl with usemymalloc set to yes, never releases
> memory back to the system -- once the internal allocator gets a chunk
> of memory from the system it's held for the duration of the process.
> This is the right answer in many circumstances, and the allocator's
> pretty nicely tuned to perl's normal allocation patterns, it's just
> not really the right thing in a persistent server situation where
> memory usage bounces up and down. It can happen with the system
> allocator too, though it's less likely.
>
> One of those engineering tradeoff things, and not much to be done
> about it really.
> --
>                                 Dan
>
> --------------------------------------it's like this-------------------
> Dan Sugalski                          even samurai
> dan@sidhe.org                         have teddy bears and even
>                                        teddy bears get drunk
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>
> This mail has originated outside your organization,
> either from an external partner or the Global Internet.
>  Keep this in mind if you answer this message.

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match


--
                                Dan

--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk
